The Effect of Bucket Size Tuning in the Dynamic Hybrid GRACE Hash Join Method

Authors

Masaru Kitsuregawa

Masaya Nakayama

Mikio Takagi

Abstract

In this paper, we present a detailed analysis and performance evaluation of the Dynamic Hybrid GRACE Hash Join Method (DHGH Method) when the tuple distribution in buckets is unbalanced. Conventional hash join methods specify the tuple distribution in buckets statically. However, the actual distribution may differ from the estimate, since join operations are applied together with selection operations. When the tuple distribution in buckets is unbalanced, a join processed with the Hybrid Hash Join Method (HH Method) costs more than in the ideal case. With the DHGH Method, on the other hand, the buckets to destage are selected dynamically, which gives the same performance as the ideal case even if the tuple distribution in buckets is unbalanced, as in Zipf-like distributions. We analyze the total I/O cost of a join operation for various numbers of buckets. The results show that the number of buckets should be determined based on the tuple distribution in buckets rather than on the size of the source relation. We show that it is better to partition the source relation into a large number of small buckets than into the smaller number of buckets, each almost filling the whole main memory, adopted in the HH Method.
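The idea of selecting destaging buckets dynamically can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' algorithm: the abstract does not say how the bucket to destage is chosen, so the "destage the largest resident bucket" policy, the tuple-granularity memory model (no output buffer pages), and the names `zipf_bucket_weights` and `simulate_dynamic_destaging` are all hypothetical. The number of spilled tuples serves as a rough proxy for I/O cost.

```python
import random


def zipf_bucket_weights(num_buckets, skew=1.0):
    """Zipf-like weights: the bucket at rank r gets weight proportional to 1 / r**skew."""
    raw = [1.0 / (rank ** skew) for rank in range(1, num_buckets + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def simulate_dynamic_destaging(bucket_ids, num_buckets, memory_tuples):
    """Stream tuples into buckets and destage buckets only when memory fills up.

    Returns (spilled, kept): tuples written to disk and tuples still resident
    in memory at the end of the partitioning phase.
    Assumption: the victim is the largest resident bucket (hypothetical policy).
    """
    if memory_tuples <= 0:
        raise ValueError("memory_tuples must be positive")
    resident = [0] * num_buckets      # tuples per bucket currently held in memory
    destaged = [False] * num_buckets  # True once a bucket has been sent to disk
    in_memory = 0
    spilled = 0
    for b in bucket_ids:
        if destaged[b]:
            spilled += 1              # bucket already lives on disk
            continue
        if in_memory == memory_tuples:
            # Memory is full: choose a victim at run time, not ahead of time.
            victim = max(range(num_buckets), key=lambda i: resident[i])
            spilled += resident[victim]
            in_memory -= resident[victim]
            resident[victim] = 0
            destaged[victim] = True
            if b == victim:
                spilled += 1          # the incoming tuple follows its bucket to disk
                continue
        resident[b] += 1
        in_memory += 1
    return spilled, in_memory


if __name__ == "__main__":
    rng = random.Random(0)
    n_tuples, memory = 10_000, 2_000
    for num_buckets in (8, 64, 512):
        weights = zipf_bucket_weights(num_buckets)
        ids = rng.choices(range(num_buckets), weights=weights, k=n_tuples)
        spilled, kept = simulate_dynamic_destaging(ids, num_buckets, memory)
        print(f"buckets={num_buckets:4d} spilled={spilled} kept_in_memory={kept}")
```

Running the script prints the spilled and resident tuple counts for several bucket counts under a Zipf-like distribution, which gives a feel for how bucket granularity interacts with skew. The model ignores bucket overflow in the join phase and page-level buffering, so it should not be read as a reproduction of the paper's cost analysis.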

Related resources

Hash-Partitioned Join Method Using Dynamic Destaging Strategy

In this paper we propose a new hash-partitioned join method using a dynamic destaging strategy for large scale databases. The traditional hash-partitioned join methods such as the Hybrid Hash Join Method assume that the size of each bucket can be controlled by selecting a split function, and the characteristics of the buckets are statically specified. For materializing this assump...

On a Three-Way Hash Join Algorithm

We develop hash-based algorithms for computing a three-way join. The method involves hashing all three relations into buckets, and then joining buckets in main memory, three buckets at a time. Compared to two cascaded hash joins, the algorithms avoid materializing an intermediate result. We present a cost model for this approach, from which we identify the range of parameters for queries that ...

Adapting Hash Joins For Modern Processors

Hash join algorithms are crucial to the performance of modern database systems. Conventional hash joins exhibit poor memory system performance on modern processors because their key data structure, the bucket-chain hash table, is ill-suited for the performance characteristics of out-of-order processors with large cache hierarchies. Whereas prior research has considered a variety of optimization...

Bucket Spreading Parallel Hash: A New, Robust, Parallel Hash Join Method for Data Skew in the Super Database Computer (SDC)

The Super Database Computer (SDC) is a high-performance relational database server for a join-intensive environment under development at the University of Tokyo. SDC is designed to execute a join in a highly parallel way. Compared to other join algorithms, a hash-based algorithm is quite efficient and easily parallelized, and has been employed by many database machines. However, in the presence of da...

Towards Eliminating Random I/O in Hash Joins

The widening performance gap between CPU and disk is significant for hash join performance. Most current hash join methods try to reduce the volume of data transferred between memory and disk. In this paper, we try to reduce hash-join times by reducing random I/O. We study how current algorithms incur random I/O, and propose a new hash join method, Seq+, that converts much of the random I/O t...

Publication date: 1989
